MLIC & IMLI
This page contains my reading notes on
- MLIC: A MaxSAT-Based Framework for Learning Interpretable Classification Rules
- IMLI: An Incremental Framework for MaxSAT-Based Learning of Interpretable Classification Rules
Satisfiability problem (SAT)
Let \{C_{1}, \dots, C_{m}\} be a set of Boolean clauses over variables x_{1}, \dots, x_{n}, where each clause is a disjunction of literals and each literal is a Boolean variable or its negation. SAT is the problem of finding an assignment of the Boolean variables that makes all clauses true.
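To make this concrete, here is a tiny SAT instance solved with the PySAT library (the tooling choice is mine, not the papers'); clauses are lists of integer literals in the DIMACS convention, where 1 means x_{1} and -1 means \neg x_{1}.

```python
from pysat.solvers import Glucose3

# (x1 ∨ ¬x2) ∧ (¬x1 ∨ x2) ∧ (x2 ∨ x3)
with Glucose3() as solver:
    solver.add_clause([1, -2])
    solver.add_clause([-1, 2])
    solver.add_clause([2, 3])
    print(solver.solve())      # True: the formula is satisfiable
    print(solver.get_model())  # one satisfying assignment, e.g. [-1, -2, 3]
```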
Maximum satisfiability problem (MaxSAT)
Let there be a nonnegative weight W(C) = w_{C} associated with each clause. MaxSAT is the problem of finding an assignment of the Boolean variables that maximizes the total weight of the satisfied clauses.
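A minimal MaxSAT sketch with PySAT's RC2 solver (again my own tooling choice): all clauses here are soft, and RC2 reports the minimum total weight of falsified clauses, which is equivalent to maximizing the satisfied weight.

```python
from pysat.formula import WCNF
from pysat.examples.rc2 import RC2

wcnf = WCNF()
wcnf.append([1], weight=3)      # x1        (weight 3)
wcnf.append([-1], weight=2)     # ¬x1       (weight 2, conflicts with the above)
wcnf.append([-1, 2], weight=1)  # ¬x1 ∨ x2  (weight 1)

with RC2(wcnf) as rc2:
    print(rc2.compute())  # [1, 2]: satisfies total weight 3 + 1 = 4
    print(rc2.cost)       # 2: total weight of falsified clauses
```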
Partial MaxSAT problem
Let \{C_{1}, \dots, C_{m}\} consist of soft clauses and hard clauses, and let there be a nonnegative weight W(C) = w_{C} associated with each soft clause. Partial MaxSAT is the problem of finding an assignment of the Boolean variables that makes all hard clauses true and maximizes the total weight of the satisfied soft clauses.
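In PySAT's WCNF format this distinction is direct: appending a clause without a weight makes it hard, appending it with a weight makes it soft.

```python
from pysat.formula import WCNF
from pysat.examples.rc2 import RC2

wcnf = WCNF()
wcnf.append([1, 2])          # hard: x1 ∨ x2 must hold
wcnf.append([-1], weight=5)  # soft: prefer ¬x1
wcnf.append([-2], weight=1)  # soft: prefer ¬x2

with RC2(wcnf) as rc2:
    print(rc2.compute())  # [-1, 2]: respects the hard clause at cost 1
```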
Problem formulation
Inputs:
- Binary matrix \mathbf{X} \in \{0, 1\}^{n \times m} of n instances with m binary features.
- Binary vector \mathbf{y} \in \{0, 1\}^{n} of n binary labels.
- Integer k indicating the number of clauses in the CNF rule.
- Regularization parameter \lambda.
Outputs:
- A CNF rule.
Variables:
- k \times m binary matrix \mathbf{B} of the variables \{b_{1, 1}, b_{1, 2}, \dots, b_{1, m}, \dots, b_{k, m}\} that represents the resulting CNF rule.
- \mathbf{B}_{l, j} = 1 means that the feature f_{j} appears in clause C_{l} of the CNF rule.
- \mathbf{B}_{l} denotes the lth row of the matrix \mathbf{B}, which corresponds to the lth clause C_{l} (see the example after this list).
- n binary noise variables \{\eta_{1}, \dots, \eta_{n}\} that indicate which instances may be treated as noise.
- If \eta_{i} = 1, the resulting CNF rule does not have to classify \mathbf{x}_{i} correctly.
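For example, with m = 3 features and k = 2 clauses, \mathbf{B} = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix} represents the rule (f_{1} \vee f_{3}) \wedge f_{2}; if additionally \eta_{2} = 1, the rule is allowed to misclassify \mathbf{x}_{2} without violating any hard constraint.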
Partial MaxSAT formulation:
CNF constraint
Q = \bigwedge_{i=1}^{n} N_{i} \wedge \bigwedge_{l=1}^{k} \bigwedge_{j=1}^{m} V_{l, j} \wedge \bigwedge_{i=1}^{n} D_{i}
We want the training accuracy to be high, so marking an instance as noise is penalized: each clause asserting that instance i is not noise carries weight \lambda. (soft clauses)
N_{i} = \neg \eta_{i} \quad W(N_{i}) = \lambda
We want the CNF rule to be sparse, so including a feature in a clause is penalized: each clause asserting that \mathbf{B}_{l, j} = 0 (the feature does not appear) carries weight 1. (soft clauses)
V_{l, j} = \neg \mathbf{B}_{l, j} \quad W(V_{l, j}) = 1
We want each non-noise instance to be correctly classified by the CNF rule. (hard clauses)
D_{i} = \left( \neg \eta_{i} \rightarrow \left( y_{i} \leftrightarrow \bigwedge_{l = 1}^{k} \bigvee_{j=1}^{m} \left( \mathbf{X}_{i, j} \wedge \mathbf{B}_{l, j} \right) \right) \right) \quad W(D_{i}) = \infty
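Putting the pieces together, here is a sketch of the full encoding with PySAT's WCNF and RC2. The variable numbering, the Tseitin auxiliaries z_{i, l} (one per instance-clause pair, meaning "clause C_{l} is satisfied by \mathbf{x}_{i}"), and the toy data are my own illustrative choices, not the papers' reference implementation.

```python
from pysat.formula import WCNF
from pysat.examples.rc2 import RC2

# Toy data (an assumption for illustration): n = 4 instances,
# m = 3 binary features, k = 2 clauses in the learned CNF rule.
X = [[1, 0, 1],
     [0, 1, 0],
     [1, 1, 0],
     [0, 0, 1]]
y = [1, 0, 1, 0]
n, m, k = len(X), len(X[0]), 2
lam = 10  # regularization weight lambda on the noise variables

# DIMACS variable numbering (ids start at 1):
#   b(l, j): feature j appears in clause l
#   eta(i):  instance i is treated as noise
#   z(i, l): Tseitin auxiliary, "clause l is satisfied by instance i"
def b(l, j): return l * m + j + 1
def eta(i): return k * m + i + 1
def z(i, l): return k * m + n + i * k + l + 1

wcnf = WCNF()

# Soft clauses N_i = ¬eta_i, weight lambda: prefer treating no instance as noise.
for i in range(n):
    wcnf.append([-eta(i)], weight=lam)

# Soft clauses V_{l,j} = ¬B_{l,j}, weight 1: prefer sparse rules.
for l in range(k):
    for j in range(m):
        wcnf.append([-b(l, j)], weight=1)

# Hard clauses D_i: ¬eta_i -> (y_i <-> AND_l OR_j (X_ij AND B_lj)).
for i in range(n):
    on = [j for j in range(m) if X[i][j] == 1]  # features true in x_i
    for l in range(k):
        # Tseitin equivalence: z(i, l) <-> OR_{j in on} b(l, j)
        wcnf.append([-z(i, l)] + [b(l, j) for j in on])
        for j in on:
            wcnf.append([-b(l, j), z(i, l)])
    if y[i] == 1:
        for l in range(k):
            wcnf.append([eta(i), z(i, l)])  # ¬eta_i -> every clause satisfied
    else:
        wcnf.append([eta(i)] + [-z(i, l) for l in range(k)])  # ¬eta_i -> some clause falsified

with RC2(wcnf) as rc2:
    model = rc2.compute()

# Decode the learned rule and the instances marked as noise.
truth = {abs(v): v > 0 for v in model}
for l in range(k):
    picked = [f"f{j + 1}" for j in range(m) if truth[b(l, j)]]
    print(f"C_{l + 1}: " + " OR ".join(picked))
print("noise:", [i + 1 for i in range(n) if truth[eta(i)]])
```

On this toy data the cheapest solution needs no noise variables, since the hard constraints are satisfiable by a short rule; raising \lambda relative to the weight-1 sparsity clauses trades sparsity for training accuracy, which is the role \lambda plays in the formulation above.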